PERFORMANCE AT VISUAL PATTERN MATCHING TASKS


Mark  W.  Cannon,  Jr. 

Aerospace  Medical  Research  Laboratory 
Wright-Patterson Air Force Base, Ohio


Summary 

A model for simulating human performance in
visual pattern matching tasks is presented. The
model is based on evidence of spatial frequency
processing in the visual system, and on the
hypothesis that shape recognition is determined only
by the low spatial frequency harmonics of the image.
Two psychophysical pattern matching experiments are
described that demonstrate a clear functional
relationship between the "similarity" of two patterns
as judged by human observers and the Euclidean
distance between spatially filtered Fourier transforms
of the patterns.

Introduction 

The research described in this report represents
a portion of the work being conducted at the
Aerospace Medical Research Laboratory to develop
quantitative models for observer-display interactions.
These models will lead to design of displays
optimally matched to human information processing
capabilities under a variety of conditions. This
paper addresses, in particular, the problem of
predicting the confusability of symbols of the type
that may be used in a graphic display. Alphanumeric
symbols are used for the tests discussed here, but
the technique is not limited to these. Any
two-dimensional display symbols can be analyzed by
this technique.

Two bodies of research have helped to lead our
work in its present direction. The first of these is
the increasing amount of psychophysical and
neurophysiological literature demonstrating the
organization of the visual system as a spatial
frequency analyzer. These works range from the early
reports of Campbell et al. [1,2,3] to some of the more
recent works by Hamerly, Quick and Reichert [4] and by
Carlson, Cohen and Gorog [5]. These latter papers
demonstrate clearly that from threshold to contrasts
of at least 40%, the frequency analysis properties
of the visual system can be closely approximated by
linear mathematics.


The second body of work which has influenced
our research is the application of spatial frequency
analysis to the recognition of two-dimensional images
by Kabrisky [6,7] and his students at the Air Force
Institute of Technology. In a number of Masters
theses [8] and Doctoral dissertations [9] over the past
10 years this group has demonstrated that the machine
recognition of printed characters can achieve a
certain amount of font independence if only the low
spatial frequencies (out to about the 3rd harmonic
of the character width) are used in both prototypes
and test characters. A simplified model of static
shape recognition in the visual system that evolves
from the synthesis of these two bodies of work is
described below.

Images of objects (inputs) are formed on the
retina and transmitted into the visual system via
the optic nerve. At some point (perhaps even in
the retina) a two-dimensional spatial frequency
transformation is performed on the input image.
Many subsequent cognitive processes have access to
all spatial frequency components, but the process
that identifies what object is present requires only
the low spatial harmonics of the image. In this
identification process, the low-pass filtered input
image is treated as a multidimensional vector. This
vector is compared with a set of stored prototype
vectors derived from a set of previously learned
low-pass filtered images. The input vector is
identified as belonging to that class of objects
represented by the nearest prototype. (Nearest is
defined as the shortest Euclidean distance in the
pattern space coordinates.) Verification of this model
requires the demonstration of a functional relationship
between Euclidean distances derived from model
predictions on a set of patterns and similarity
judgments or misclassification probabilities derived
from psychophysical experiments involving the same set
of patterns. We have been able to demonstrate such a
functional relationship using sets of alphanumeric
characters as our test patterns. The remainder of
this paper is devoted to describing two types of
psychophysical tests and associated simulation
results which provide a strong partial validation
of the model.


Description  of  Experiments 
Computer  Analysis  of  Symbols 

A description of the computer processing performed
on the symbol sets is in order at this point,
since essentially the same model distance predictions
are used to analyze both psychophysical experiments.
The symbols used were digitized as ones on a
background of zeros for computer analysis. Each symbol,
except for a few in set 1, has a maximum size of
10 x 14 points, and is located in a 32 x 32
background window of zeros. The four symbol sets used in
these experiments are shown in the appendix in the
same form in which they were presented to the human
observers. The sawtooth effect of digitized diagonal
lines was therefore presented to both humans
and computer.


In both experiments, the same symbol set was
used as both input and prototype. As we will see,
this is fully justified in experiment 1, and is the
best we can do at present for experiment 2. The 36
symbols were Fourier transformed using a
two-dimensional Cooley-Tukey FFT algorithm and filtered
using a square low-pass filter. The dc term and all
harmonic components up to the maximum desired were
saved while all higher terms were set equal to zero.
Next, a 36 x 36 correlation matrix was generated.
Each row of this matrix contained the maximum value
of the cross-correlation functions computed between
the filtered input symbol and each of the other
symbols of the set. The correlation function for a
symbol pair (i,j) was determined by multiplying the
filtered spectrum of symbol i with the complex
conjugate of the filtered spectrum of symbol j and taking
the inverse transform of the product. The maximum
amplitude of this inverse transform was the maximum
correlation coefficient. Finally, a distance matrix
D was computed from the correlation matrix by
determining the Euclidean distances at which maximum
correlation occurred. This distance is the minimum mean
square distance between the two patterns, and can be
derived from the maximum correlation coefficient, as
we now show. Let X and Y be the position vectors of
two patterns in the spatial domain and let d be the
distance between X and Y.

$$d = |X - Y| = \sqrt{(X - Y) \cdot (X - Y)} \tag{1}$$

$$d = \sqrt{X \cdot X - 2 X \cdot Y + Y \cdot Y} \tag{2}$$

The vectors have been energy-normalized after
filtering, so the dot products of each vector with
itself are equal to 1 and the distance is

$$d = \sqrt{2 - 2p} \tag{3}$$

where p is the cross-correlation between X and Y.

If we let p be the maximum of the cross-correlation
function, the distance d is minimized. The value of
d can range from zero to 1.4, since p ranges from
zero to 1. The 36 x 36 distance matrix D contained
these minimum distances between each input symbol and
all other symbols of the set. Diagonal elements
were zero, since these represent the distance from
each symbol to itself.
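
The processing chain described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the program used in the study: the function names (`lowpass_spectrum`, `max_correlation`, `distance_matrix`) are hypothetical, the symbols are assumed to be 32 x 32 arrays of ones on zeros, the cross-correlation is circular (as it is when computed through the FFT), and the filter bandwidth is expressed in harmonics of the window.

```python
import numpy as np

def lowpass_spectrum(img, max_harmonic):
    """2-D FFT, square low-pass filter, then energy normalization."""
    n_rows, n_cols = img.shape
    spec = np.fft.fft2(img)
    # Signed harmonic number of every FFT bin (0, 1, 2, ..., -2, -1).
    ky = np.fft.fftfreq(n_rows, d=1.0 / n_rows)
    kx = np.fft.fftfreq(n_cols, d=1.0 / n_cols)
    # Square filter: keep the dc term and all harmonics up to max_harmonic.
    keep = (np.abs(ky)[:, None] <= max_harmonic) & (np.abs(kx)[None, :] <= max_harmonic)
    spec = np.where(keep, spec, 0.0)
    # Parseval: spatial-domain energy equals sum(|spec|^2) / (number of points).
    energy = np.sum(np.abs(spec) ** 2) / (n_rows * n_cols)
    return spec / np.sqrt(energy)

def max_correlation(spec_i, spec_j):
    """Peak of the circular cross-correlation of two filtered symbols."""
    corr = np.fft.ifft2(spec_i * np.conj(spec_j)).real
    return corr.max()

def distance_matrix(symbols, max_harmonic):
    """Minimum Euclidean distance d = sqrt(2 - 2p) for every symbol pair."""
    specs = [lowpass_spectrum(s, max_harmonic) for s in symbols]
    n = len(specs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            p = max_correlation(specs[i], specs[j])
            # Guard against tiny negative values caused by round-off.
            D[i, j] = np.sqrt(max(0.0, 2.0 - 2.0 * p))
    return D
```

Because the peak of the correlation function is used, a symbol and a shifted copy of itself come out at essentially zero distance, which is the position tolerance the maximum-correlation step is meant to provide.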

Experiment 1: Shape Matching

The subjects were seated before a chart containing
symbols of font 1, 2 or 3, as shown in the
appendix, but the symbols on the experimental charts
were arrayed in a random order. The relative
distances between the symbols were larger than those
shown on the charts in the appendix and the symbols
themselves were 2.5 cm wide. The subject was seated
at a distance from the chart such that the symbol
width was one degree of visual angle. Subjects were
given a randomized list naming all 36 symbols in the
set. The subjects were instructed to locate on the
chart a symbol named in the random list and then to
report which other symbol on the chart matched it
most closely in shape.
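
As a rough check on the viewing geometry, the distance implied by these figures can be worked out directly. This is a sketch that assumes the 2.5 cm symbol width subtends exactly one degree and is viewed head-on; the symbols w (symbol width), \theta (visual angle) and r (viewing distance) are introduced here only for illustration.

```latex
\[
r = \frac{w}{2\tan(\theta/2)}
  = \frac{2.5\ \text{cm}}{2\tan(0.5^\circ)}
  \approx 143\ \text{cm}
\]
```

Under these assumptions the subjects would have been seated roughly 1.4 m from the chart.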

In the account below, we will refer to the
symbol to be matched as the test symbol and the symbol
chosen to match it as the comparison symbol. Set 1
was viewed by 35 subjects, set 2 by 23 subjects and
set 3 by 27 subjects. The results showed that for
each symbol of the test set, there was a good deal
of agreement about which comparison symbol was
closest. If the comparison symbols for each test
symbol are arranged in rank order by the number of
times each was chosen as closest, and the number in
each rank is summed over all three sets as in Fig. 1,
we see that approximately 50% of the choices are in
rank 1. In fact, the curve is nearly perfectly
exponential by rank. In rank 2, 25% of the people
agree on the closest match and in rank 3
approximately 12% agree that this is the closest match.

Since 75% of the subjects agree that the rank 1
or 2 choices are the closest matches, we decided that
our computer model would be given credit for a
correct choice if the comparison symbol picked by the
model as closest to each test symbol agreed with
either the human rank 1 or 2 choice.
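
The scoring rule just described can be sketched as follows. This is an illustration only: the function name `model_agreement` is hypothetical, D is assumed to be the model distance matrix and C a matrix of human choice counts (row = test symbol, column = comparison symbol), and ties in the human counts are broken arbitrarily by column order, which the text does not specify.

```python
import numpy as np

def model_agreement(D, C):
    """Fraction of test symbols whose model choice matches human rank 1 or 2."""
    D = np.asarray(D, dtype=float).copy()
    C = np.asarray(C, dtype=float)
    # A symbol cannot be chosen as its own closest match.
    np.fill_diagonal(D, np.inf)
    # Model choice: nearest comparison symbol for each test symbol (row).
    model_choice = D.argmin(axis=1)
    # Human rank 1 and rank 2: the two most frequently chosen comparison symbols.
    top_two = np.argsort(-C, axis=1, kind="stable")[:, :2]
    hits = (top_two == model_choice[:, None]).any(axis=1)
    return hits.mean()

# Toy example with four symbols (numbers invented for illustration).
D_toy = [[0.0, 0.2, 0.9, 0.8],
         [0.2, 0.0, 0.5, 0.7],
         [0.9, 0.5, 0.0, 0.3],
         [0.8, 0.7, 0.3, 0.0]]
C_toy = [[0, 8, 1, 3],
         [2, 0, 7, 5],
         [1, 6, 0, 5],
         [4, 5, 1, 0]]
score = model_agreement(D_toy, C_toy)  # 0.5: two of the four rows agree
```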

The results of the computer choices for a range
of filter bandwidths are shown in Fig. 2. The
ordinate gives the number of times the model agreed with
human choices. The best score across all three
sets occurs at the eighth harmonic of the 32 x 32
viewing window, which is between the second and
third harmonic of the symbol. The score at the
sixth harmonic of the window, which is equal to
the second harmonic of the symbol, is just marginally
lower, so maximum performance occurs in the sixth
to eighth harmonic range. The overall score is an
80% correct match with the human data. Note also
that performance deteriorates at both larger and
smaller filter bandwidths. We see that those
symbol pairs with smallest intersymbol distances as
determined by our model correspond quite well with
human judgments of symbol pairs most similar in
shape. However, if shape similarity is related to
intersymbol distance, there should be a functional
relationship between the number of times a comparison
symbol is chosen as most similar to a test symbol and
the Euclidean distance between test and comparison
symbols. This relationship is derived in the
following way.

Fig. 1. A symbol rank is determined by how
many subjects chose it as a closest match to
the test symbol. This figure shows how many
subjects chose symbols in each rank averaged
across all symbols in the set. The solid line
connects the average of each rank.

Fig. 2. The sample points represent
percentage of model choices for closest match
to a test symbol that agreed with either rank
1 or 2 choices of human subjects. These are
plotted as a function of the filter bandwidth
as explained in the text. The solid line is
an average of the sample points.
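
The correspondence between window harmonics and symbol harmonics quoted in the text can be checked with a short calculation. This is a sketch that assumes the symbol is 10 points wide in a 32-point window; the symbols N (window width), w (symbol width), m (symbol harmonic) and k (window harmonic) are introduced here only for illustration.

```latex
\[
k = \frac{N}{w}\, m = \frac{32}{10}\, m = 3.2\, m,
\qquad
m = 2 \;\Rightarrow\; k = 6.4,
\qquad
m = 3 \;\Rightarrow\; k = 9.6
\]
```

On this reading the sixth window harmonic lies close to the second harmonic of the symbol, and the eighth window harmonic falls between the second and third.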


The human matching results for a particular
symbol set are arranged into a 36 x 36 choice matrix C.
Each entry Cij is the number of times that subjects
chose comparison symbol j as most similar to test
symbol i. The intersymbol distances corresponding
to each i,j pair are contained in the previously
computed distance matrix D for this symbol set. A
filter corresponding to the sixth harmonic of the
window was used to compute D. Let us now divide up
the distance axis into bins of width 0.05. Using
matrices C and D, we add up all choices that fall in
a given distance bin and divide by the number of test
symbols that generated choices in that bin. This
gives us the average number of choices per test
symbol at a given distance between test and comparison
symbols. Further normalization was accomplished for
each set by dividing these averages by the number of
subjects who took part for that set. The final
averages for all three sets are plotted as a
function of distance in Fig. 3. The points from the
three sets overlap in a very satisfactory manner,
almost as if all were derived from the same function.
Thus Fig. 3 demonstrates, as we had hoped, that the
relative number of times a particular comparison
symbol is chosen to be a best match for a given test
symbol decreases with increasing distance between
comparison and test symbols. Apparently, similarity
and Euclidean distance are functionally related for
our spatially filtered symbols.
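
The binning procedure can be sketched as below. This is one reading of the description rather than the original program: the function name `choices_vs_distance` is hypothetical, and it is assumed that only nonzero entries of C count as "choices generated" in a bin and that diagonal entries are ignored.

```python
import numpy as np

def choices_vs_distance(C, D, n_subjects, bin_width=0.05):
    """Average choices per test symbol in each distance bin, per subject.

    Returns a dict mapping bin index b (covering distances
    [b * bin_width, (b + 1) * bin_width)) to the normalized average.
    """
    C = np.asarray(C, dtype=float)
    D = np.asarray(D, dtype=float)
    n = C.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    # Small offset keeps distances sitting on a bin edge from rounding down.
    bins = np.floor(D / bin_width + 1e-9).astype(int)
    chosen = off_diag & (C > 0)
    result = {}
    for b in np.unique(bins[chosen]):
        in_bin = chosen & (bins == b)
        total_choices = C[in_bin].sum()
        # Number of test symbols (rows) that generated choices in this bin.
        n_test_symbols = in_bin.any(axis=1).sum()
        result[int(b)] = total_choices / n_test_symbols / n_subjects
    return result

# Toy example with three symbols and ten subjects (numbers invented).
D_toy = [[0.00, 0.22, 0.61],
         [0.22, 0.00, 0.23],
         [0.61, 0.23, 0.00]]
C_toy = [[0, 8, 2],
         [6, 0, 4],
         [0, 10, 0]]
curve = choices_vs_distance(C_toy, D_toy, n_subjects=10)
```

In the toy example, bin 4 (distances 0.20 to 0.25) collects 28 choices from three test symbols and bin 12 (0.60 to 0.65) collects 2 choices from one test symbol, so the curve falls with distance in the way Fig. 3 is described as doing.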


[Figure: MATCHES VS DISTANCE; abscissa: DISTANCE]

Fig. 3. The sample points represent the
average number of times per test symbol that
a given comparison symbol was picked as a
closest match in shape to that test symbol.
The abscissa is the model-predicted Euclidean
distance between spatially filtered test and
comparison symbols. The solid line is the
average of the sample points, and shows a
definite decrease with distance between symbols.


Experiment  2:  Symbol  Recognition 

In a series of psychophysical experiments
conducted at the Air Force Flight Dynamics Laboratory,
Dr. Larry Goble has evaluated the confusability of
three of the symbol sets shown in the appendix. Sets
2, 3 and 4 were used in his experiment. The subjects
viewed the sets statically to become familiar with
the shapes. They were then asked to identify the
symbols when they were flashed on a screen and
partially masked by a preceding and following uniform
field of the same intensity as the symbol. The
paradigm proceeded as follows: uniform field 10 msec,
blank screen 5 msec, symbol 10 msec, blank screen
5 msec and uniform field 10 msec. This paradigm
involves some significant differences from the
matching experiment covered in the previous
section. First, the short duration of the symbol
presentation and the masking involve some temporal
input parameters not yet treated in the model. We
cannot yet say what effects stimulus duration would
have on the complicated spectrum of a symbol, and we
cannot adequately define the effect of pre- and
post-stimulus masking. We will just assume these effects
are small when we apply the model to analyze the
results of these psychophysical experiments. The
second problem is the prototype to which the test
symbols are compared. We assume that the subject
can learn the symbol shapes for each set well enough
to compare the inputs to them. However, the
prototype may be some combination of a wide variety of
fonts to which the subject has been exposed. The
fact that the subject can change his prototype set
was demonstrated by Goble's data. Each subject
viewed each symbol set 48 times and set 4 showed
a distinct learning curve measured in number of
correct responses. The other two sets showed
trial-to-trial variation, but the average number of correct
responses remained relatively constant over all
trials.

The first comparison of model performance to
human data was very similar to the comparison made
in Fig. 2. The symbol with the smallest intersymbol
distance in the ith row of the distance matrix D was
picked as the model's choice for the symbol most
confusable with the ith test symbol. We asked, "In how
many cases does this choice agree with the symbol
that produced either the largest or second largest
number of human errors in each row?" The results
are plotted in Fig. 4. The plot shows the number of
test symbols for which model predictions agreed with
human data as a function of the filter bandwidth.

The striking difference between these plots and those
of Fig. 2 is that they reach a maximum percentage of
agreement at a filter bandwidth equal to the fifth
harmonic of the window and are essentially flat above
that bandwidth. Note also that the average curve has
a maximum of only 47% compared to 80% for the static
matching experiment. However, scanning the data
showed us that there was still considerable
correlation between the distribution of errors and Euclidean
distance. In Fig. 5 we derive a curve giving the
average number of human errors per test symbol as a
function of the model-generated distance between the
test symbol and a comparison symbol. We again
divided the distance axis into bins of width 0.05. We
summed the number of errors in each bin and divided
by the number of test symbols for which errors were
generated in that bin and by the total number of
errors for all symbols of that set. The total errors
varied considerably over the three sets. There were
1639 errors for set 2, 3184 errors for set 3 and 2893
errors for set 4. The averages for each set were
plotted as the sample points in Fig. 5, and we see
that there is a definite decrease in number of errors
per test symbol as the distance between test and
comparison symbol increases. Again, all three symbol
sets give results which fall along the same curve,
independent of the symbol set, depending only on
Euclidean distance.


[Figure: abscissa: WINDOW HARMONICS]

Fig. 4. The sample points represent the
percentage of times that the symbol pairs with
smallest model-derived intersymbol distances
were the same as those symbol pairs having the
largest or second largest number of confusions
in a symbol identification experiment. The
solid line is the average of the points. The
abscissa gives the bandwidth of the model
spatial frequency filter as explained in the text.


[Figure: ERRORS VS DISTANCE]

Fig. 5. The sample point ordinates represent the
number of times per test symbol that a test
symbol was mistaken for a particular comparison
symbol. The solid line is the average of the
sample points and demonstrates a decrease in
the number of classification errors with an
increase in the model-predicted distance
between the symbols.

Discussion of Results

The model accounts very well for human
performance at a pattern-matching task involving static
imagery. Human shape-similarity judgments can be
modeled by minimum Euclidean distances provided that
the imagery is static. The model does not do so
well in attempting to predict which particular
symbol pairs will produce the most errors (Fig. 4)
when the symbol images are dynamic. The general
trends which were shown for the static experiment
do hold, however, and we may say that on the average
those symbol pairs which are shown to be close
together by the static analysis model will produce
more confusions in the dynamic case than those symbol
pairs shown to be far apart statically. It may be
possible to treat the uniform masking fields as
some sort of noise in the system, since it is their
presence that causes most of the errors. A number
of researchers [10,11,12] have developed models of
temporary visual memory storage which may be applicable.
Sperling, in particular, has demonstrated that
presentation of a uniform pre- and post-stimulus masking
field interferes with the storage of symbols in this
temporary memory area. However, a much more general
model of spatio-temporal interactions in the visual
system is required before we can apply this promising
approach to the analysis of dynamic display imagery.
The results of experiments by Arend [13] have
demonstrated that, at least at threshold, there is a
greater decline in sensitivity to high spatial
frequencies than to low spatial frequencies as
the stimulus duration is decreased from continuous
to 20 msec. Thus the distortion in the spectrum of
the input symbol image is due to the temporal
response variation among the spatial frequency channels
as well as the interference of the uniform field.
Similar spatio-temporal effects are being
investigated by our laboratory, and should supply a firm
base on which to build a model of human performance
that can be applied to dynamic as well as static
imagery.